
    Estimation of the Handwritten Text Skew Based on Binary Moments

    Binary moments are one of the methods for estimating text skew in binary images. The method has been widely used to identify the skew of printed text. Handwritten text, however, consists of text objects with different skews, so the method has to be adapted. This is achieved by splitting the image into separate text objects delimited by bounding boxes. The resulting text objects are isolated binary objects, and applying the moment-based method to each of them yields its local text skew. Because of their accuracy, the estimated skew data can be used as input to text line segmentation algorithms.
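A minimal sketch of the moment-based estimate for one isolated binary object, assuming the standard second-order central-moment orientation formula, theta = 0.5 * atan2(2*mu11, mu20 - mu02). The function name and the pixel-list input are illustrative assumptions, not the authors' implementation.

```python
import math

def skew_angle_degrees(pixels):
    """Estimate the orientation of one binary object from its central moments.

    pixels: list of (x, y) coordinates of foreground pixels (hypothetical input
    format). The angle is measured from the x axis; with image coordinates
    (y pointing down) a positive angle corresponds to a downward slope.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("object has no foreground pixels")
    # Centroid of the object.
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Second-order central moments.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    # Orientation of the principal axis.
    return math.degrees(0.5 * math.atan2(2.0 * mu11, mu20 - mu02))

# Usage: one angle per bounding-box object gives the local skews.
diagonal_stroke = [(i, i) for i in range(10)]
local_skew = skew_angle_degrees(diagonal_stroke)
```

Calling the function once per bounding-box object gives the per-object local skews that the abstract describes.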

    Examining the impact of Ground Control point quantity on the geometric accuracy of UAV photogrammetric products formed using Structure-from-Motion approach

    The positional and vertical accuracy of UAV aerial photogrammetry products generated using the Structure from Motion (SfM) approach depends on various factors, such as flight plan parameters, camera quality, camera calibration, the SfM algorithm used, and the georeferencing process. This study investigated how the number of Ground Control Points (GCPs) influences the geometric quality of the generated models and the stability of the camera calibration parameters estimated through self-calibration in the block aerotriangulation process. Three software systems were used to process the collected UAV photogrammetry images: Pix4D Mapper, Agisoft Metashape, and Trimble Inpho UASMaster. Standard statistical quality measures were used to assess the accuracy of the block aerotriangulation. The findings indicate that increasing the number of GCPs improves model reliability and decreases the RMSE of vertical deviation on the control points. For all three software systems, the RMSE of vertical deviation on the check points converged to approximately twice the average spatial resolution, while the RMSE of positional deviation on the check points converged to the average spatial resolution.
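A short sketch of how positional and vertical RMSE on check points are commonly computed, assuming each check point contributes one residual (dx, dy, dz) between model and surveyed coordinates. The function name and the residual values are hypothetical and are not taken from the study.

```python
import math

def checkpoint_rmse(residuals):
    """Return (positional RMSE, vertical RMSE) for a list of (dx, dy, dz) residuals.

    Positional RMSE combines the two horizontal components; vertical RMSE uses dz only.
    """
    n = len(residuals)
    if n == 0:
        raise ValueError("no check points given")
    positional = math.sqrt(sum(dx * dx + dy * dy for dx, dy, _ in residuals) / n)
    vertical = math.sqrt(sum(dz * dz for _, _, dz in residuals) / n)
    return positional, vertical

# Hypothetical residuals in metres, for illustration only.
example_residuals = [(0.03, 0.04, 0.00), (0.03, 0.04, 0.00)]
rmse_xy, rmse_z = checkpoint_rmse(example_residuals)
```

Dividing each RMSE by the average spatial resolution gives the ratios the abstract reports, about 1 for position and about 2 for height.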

    Refinement of Individual Tree Detection Results Obtained from Airborne Laser Scanning Data for a Mixed Natural Forest

    Numerous semi-automatic and fully automatic algorithms have been developed for individual tree detection from airborne laser scanning data, but their results come with varying rates of falsely detected treetops. In this paper, we propose an approach that includes a machine learning-based refinement step to reduce the number of falsely detected treetops. The approach involves local maxima filtering and segmentation of the canopy height model to extract segment-level features, which are then used to classify treetop candidates. The study was conducted in a mixed temperate forest, predominantly deciduous, with complex topography and an area of 0.6 km × 4 km. The classification model was trained with five machine learning approaches: Random Forest (RF), Extreme Gradient Boosting, Artificial Neural Network, Support Vector Machine, and Logistic Regression. The final classification model, with optimal hyperparameters, was based on the best-performing classifier (RF). The overall accuracy (OA) and kappa coefficient (κ) obtained from ten-fold cross-validation on the training data were 90.4% and 0.808, respectively. Prediction on the test data resulted in OA = 89.0% and κ = 0.757. This indicates that the proposed method could be an adequate solution for reducing falsely detected treetops before tree crown segmentation, especially in deciduous forests.
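For readers unfamiliar with the two reported metrics, the following sketch computes overall accuracy and Cohen's kappa from a confusion matrix. The function name and the matrix values are hypothetical and are not the paper's data.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal (this is the OA).
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical 2-class matrix: correctly vs falsely detected treetops.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [5, 45]])
```

Kappa discounts the agreement expected by chance, which is why it is lower than OA.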

    Estimation of the Fundamental Frequency of the Speech Signal Compressed by MP3 Algorithm

    The paper analyzes the estimation of the fundamental frequency of a real speech signal, obtained by recording a speaker in a real acoustic environment and modeled by the MP3 method. The estimation was performed by the Picking-Peaks algorithm with parametric cubic convolution (PCC) interpolation. The efficiency of PCC was tested for the Catmull-Rom, Greville, and two-parametric Greville kernels. The window giving optimal results was chosen according to the mean squared error (MSE).
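A sketch of parametric cubic convolution interpolation, assuming the one-parameter Keys kernel, for which alpha = -0.5 gives the Catmull-Rom case. The Greville kernels are not shown. The function names and the edge clamping are illustrative assumptions, not the paper's implementation.

```python
import math

def keys_kernel(x, alpha=-0.5):
    """One-parameter cubic convolution kernel; alpha = -0.5 is Catmull-Rom."""
    x = abs(x)
    if x <= 1.0:
        return (alpha + 2.0) * x**3 - (alpha + 3.0) * x**2 + 1.0
    if x < 2.0:
        return alpha * x**3 - 5.0 * alpha * x**2 + 8.0 * alpha * x - 4.0 * alpha
    return 0.0

def interpolate(samples, t, alpha=-0.5):
    """Interpolate uniformly spaced samples at fractional index t.

    Uses the four nearest samples; indices are clamped at the edges
    (an assumption made here for simplicity).
    """
    base = math.floor(t)
    total = 0.0
    for k in range(base - 1, base + 3):
        idx = min(max(k, 0), len(samples) - 1)
        total += samples[idx] * keys_kernel(t - k, alpha)
    return total

# Usage: evaluate a sampled curve between two samples around a picked peak.
value = interpolate([0.0, 1.0, 2.0, 3.0], 1.5)
```

In a peak-picking setting, the interpolated curve around the largest sample can be evaluated on a finer grid to locate the peak between samples.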

    Semantic segmentation of airborne laser scanning point clouds using machine learning methods

    Light Detection and Ranging (LiDAR) technology has proven very successful for the rapid collection of massive amounts of spatial data on the topography of the Earth's physical surface. Semantic segmentation of an Airborne Laser Scanning (ALS) point cloud, also called point cloud classification, semantic labeling, or semantic point cloud classification, is a major challenge because of the structure of the point cloud and the types of classes that can be identified in that space. Machine learning (ML), on the other hand, is a powerful mathematical tool that can be used for a variety of applications, including this procedure. This dissertation analyzes which ML methods give the best results for semantic segmentation of point clouds, especially stacked ensemble ML models constructed by combining several base ML models. The ALS dataset was also balanced: points belonging to minority classes were synthetically generated, while points belonging to majority classes were heavily reduced. The type of neighborhood search and the influence of the search radius were analyzed, and a multi-scale search approach for generating the geometric features of the points was examined. A large number of point features (attributes) were computed, and those most significant for the semantic segmentation of the point cloud were selected. Semantic segmentation of ALS point clouds was performed using ten different ML methods. The highest overall accuracy (OA) of the semantic segmentation of the ALS point cloud was 83.5% for the Support Vector Machine method applied to the ISPRS test data, while an OA of 93.6% was achieved on the GRSS test ALS data with a stacked ensemble model based on naive Bayes and the stacking of several ML models (Random Forest, Gradient Boosting and Logistic Regression). In some applications, segmentation of the ALS point cloud also implies extraction of objects of interest (building roofs, tree crowns, etc.). Within this research, the segmentation of the ALS point cloud of a forested area was analyzed with the aim of Individual Tree Detection (ITD). The approach involves Local Maxima (LM) filtering and segmentation of individual tree crowns based on the Canopy Height Model (CHM). This procedure is important for generating different types of segment-level features, which are later used to classify treetop candidates as correctly or incorrectly detected. The study was conducted in a mixed temperate forest, predominantly deciduous, with complex topography and an area of 0.6 km × 4 km. Classification performance was examined for five machine learning approaches: Random Forest (RF), Extreme Gradient Boosting (XGB), Artificial Neural Network (ANN), Support Vector Machine (SVM) and Logistic Regression (LR). Here, too, the classes of the dataset were balanced to achieve better classification accuracy. The final classification was performed with the random forest model, which gave the best classification accuracy. The OA and the kappa coefficient of agreement (κ) obtained from ten-fold cross-validation on the training data were 90.4% and 0.808, respectively. Applying the trained model to an independent dataset resulted in OA = 89.0% and κ = 0.757. The dissertation closes with guidelines for further research and development.
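The Local Maxima filtering step can be illustrated with a small sketch on a gridded Canopy Height Model. The window size, the height threshold, the strict-maximum rule, and the function name are assumptions made for illustration; the dissertation's actual parameters are not given in the abstract.

```python
def local_maxima(chm, window=3, min_height=2.0):
    """Return (row, col) cells that are strict maxima of their window.

    chm: 2D list of canopy heights (hypothetical raster, metres).
    Cells lower than min_height are ignored as non-canopy.
    """
    half = window // 2
    rows, cols = len(chm), len(chm[0])
    treetops = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if height < min_height:
                continue
            # Collect the window around the cell, clipped at the raster border.
            neighbourhood = [
                chm[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            # Keep the cell only if it is the unique highest value in the window.
            if height >= max(neighbourhood) and neighbourhood.count(height) == 1:
                treetops.append((r, c))
    return treetops

# Hypothetical 5 x 5 canopy height model.
candidates = local_maxima([
    [0, 0, 0, 0, 0],
    [0, 5, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 7, 0],
    [0, 0, 0, 0, 1],
])
```

The returned cells are only treetop candidates. In the approach described above, a classifier then separates correctly from falsely detected ones.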

    An approach to the language discrimination in different scripts using adjacent local binary pattern

    The paper proposes a method for discriminating the language of documents. First, each letter is encoded with a script type according to its position relative to the baseline area. The resulting coded text is then subjected to feature extraction, in which the local binary pattern and its extended version, the adjacent local binary pattern, are extracted. Because language characteristics differ, the extracted features show significant diversity, and this diversity is the key to deciding between languages. The proposed method is tested on a sample of documents, and the experiments give encouraging results.

    Script Characterization in the Old Slavic Documents


    Medical Images Transform by Multistage PCA-Based Algorithm
